Tuple construction from data packets

ABSTRACT

In one approach for processing a data packet, in at least one stage of a plurality of stages of a pipeline circuit, a respective packet field value is extracted from the data packet. In each stage of the plurality of stages, a respective tuple field value is inserted into a respective tuple register of the stage at a respective offset. The respective tuple field value in the at least one stage is based on the respective packet field value. In each stage of the plurality of stages except a last one of the stages, the contents of the respective tuple register of the stage are provided as input to a next one of the stages.

FIELD OF THE INVENTION

The disclosure generally relates to packet processing and building tuples from the packets.

BACKGROUND

In some implementations, a network packet processor inputs a stream of network packets, manipulates the contents of the network packets, and outputs another stream of modified network packets. The manipulations may implement a protocol for processing network packets. For example, the network packet processor may implement a protocol layer of a communication protocol, and for a high-level packet received from a higher protocol layer and delivered to a lower protocol layer for eventual transmission on the communication media, the manipulations may encapsulate the high-level packet within a low-level packet of the lower protocol layer.

A common task in processing packets is to form a compact data tuple based on certain fields of a packet. The data tuple makes processing of the assembled data convenient. For example, in a packet classification task, certain address fields and/or type fields are extracted from a packet and then used together as a lookup key to determine the class of the packet. The particular fields and positions of the fields in the packet may vary depending on processing functions and protocols.

The data rate at which packets are transmitted presents challenges for processing the packets at a rate sufficient to keep pace with the data transmission rate. In packet processing applications, packets are streamed word-wise, for example using words that are 512 bits wide and achieving a 100 Gbps data rate. Each packet may be comprised of multiple 512-bit words. The fields of a packet that are used in constructing a tuple are generally located in different areas of the packet. Thus, the fields of a packet will be available at different discrete times. The times at which the fields become available are not necessarily static, since packet structures can vary from packet to packet, such as with variable field sizes.

SUMMARY

A method for processing a data packet includes, in at least one stage of a plurality of stages of a pipeline circuit, extracting a respective packet field value from the data packet. In each stage of the plurality of stages, a respective tuple field value is inserted into a respective tuple register of the stage at a respective offset. The respective tuple field value in the at least one stage is based on the respective packet field value. In each stage of the plurality of stages except a last one of the stages, the contents of the respective tuple register of the stage are provided as input to a next one of the stages.

A packet processing circuit includes a plurality of pipeline stages. Each stage includes a field extraction circuit and a tuple construction circuit. The field extraction circuit is configured to receive a data packet and is configurable to extract none or a plurality of packet field values from the data packet. The tuple construction circuit is coupled to receive an input tuple and each packet field value from the field extraction circuit. The tuple construction circuit is configured to insert a respective tuple field value into the input tuple at a respective offset and output a tuple having the inserted respective tuple field value. The respective tuple field value is based on the at least one packet field value.

Other aspects and features will be recognized from consideration of the Detailed Description and Claims, which follow.

BRIEF DESCRIPTION OF THE DRAWINGS

Various aspects and advantages of the methods and circuits will become apparent upon review of the following detailed description and upon reference to the drawings in which:

FIG. 1 shows the content and format of an IP packet;

FIG. 2 shows a first example tuple formed from fields extracted from the IP packet of FIG. 1;

FIG. 3 shows a second example tuple formed from fields extracted from the IP packet of FIG. 1;

FIG. 4 shows a circuit diagram of a pipeline circuit having m stages for extracting fields from an input packet and assembling the extracted fields into a tuple;

FIG. 5 shows an example implementation of a tuple construction circuit that inserts a field value into a tuple in a stage of the pipeline circuit of FIG. 4;

FIG. 6 shows the steps of a tuple under construction;

FIG. 7 shows the construction of a first tuple field and the mask for the first tuple field;

FIG. 8 shows the construction of a second tuple field and the mask for the second tuple field;

FIG. 9 is a flowchart of an example process for constructing a tuple; and

FIG. 10 shows an example programmable integrated circuit (IC) on which the circuitry described herein may be implemented.

DETAILED DESCRIPTION OF THE DRAWINGS

To achieve a suitable level of performance and flexibility, it may be desirable to aggregate field values of packets into tuples at a high data rate. In addition, it may be desirable to programmably select fields from data packets and formats of the tuples. In one approach, a method of processing a data packet includes, in at least one stage of multiple stages of a pipeline circuit, extracting a respective packet field value from the multiple fields of the data packet. In each of the stages, a respective tuple field value is inserted into a respective tuple register of the stage at a respective offset. In at least one stage in which the value of a field is extracted, the respective tuple field value is based on the respective packet field value. Depending on application requirements, the tuple field value may also be based on one or more constants or one or more input tuple field values. In each stage except the last stage, the contents of the respective tuple register of the stage are provided as input to a next one of the stages. With the pipelined approach, a tuple can be produced from an input stream of data packets in every cycle. With parallel circuitry, multiple tuples could be generated.

FIGS. 1, 2, and 3 illustrate an example packet and two alternative tuples generated from data in the packet. Though the example is based on an Internet Protocol (IP) packet, it will be appreciated that the approaches described herein may be applied to a variety of different packet protocols.

FIG. 1 shows the content and format of an IP packet 100. The first six 32-bit words are the header of the packet. The source port and destination port fields are part of the payload of the packet. Additional data in the payload portion and the trailer are not illustrated.

FIG. 2 shows a first example tuple 200 formed from 6 fields extracted from the IP packet of FIG. 1. The t field is a type field, the value of which is computed based on fields that appear before an IPv4 or IPv6 header. For example, the value of the field may be based on a field called ‘type’ in the Ethernet header as:

If (Ethernet.type==0x800) //IPv4 type code
    Tuple.t=0
Else if (Ethernet.type==0x86dd) //IPv6 type code
    Tuple.t=1
Else
    Tuple.t=0; //default case

The proto field in the tuple 200 corresponds to the protocol field in the IP header, the srcIP field in the tuple is the source IP address field from the packet 100, the dstIP field in the tuple is the destination IP address field from the packet, the srcPort field is the source port field from the packet, and the dstPort field is the destination port field from the packet.

FIG. 3 shows a second example tuple 300 formed from the 6 fields extracted from the IP packet of FIG. 1. The order of the fields in tuple 300 is different from the order of the fields in tuple 200. The particular tuple structure depends on the requirements for high-speed processing of the tuple. Although the example tuples 200 and 300 show fields of the packet being copied from the packet to the designated positions in the tuples, the tuple field values may be computed using arithmetic or logic functions of combinations of the field values, local variables, constants, and/or tuple field values from a previous stage of the pipeline.

A tuple aggregation circuit is provided to construct tuples in a pipelined fashion. FIG. 4 shows a circuit diagram of a pipeline circuit 400 having m stages for extracting fields from an input packet and assembling the extracted fields into a tuple. Each stage inserts one or more tuple field values into the tuple. A full packet is input on line 402 and is registered between each of the stages in packet registers 404, 406, and 408. Once the processing of a packet is complete in one stage, that packet is passed to the next stage for processing and a new packet is input. Thus, the stages are processing different packets concurrently, or different words of the same packet. The completed tuple for a packet is output on line 412 as the corresponding packet is output on line 414.

Each stage of the pipeline circuit 400 includes a field extraction circuit 420, a constant staging circuit 422, a computation circuit 424, and a tuple construction circuit 426. Programmed control information is input to the circuit elements for controlling each circuit element. The programmed control information indicates which fields to extract from the packet, any constants to be used, the computation to be performed, and offsets and sizes of the tuple field values in the tuple. The programmed control information may be provided via a microprogramming control store (not shown), for example.

The field extraction circuit 420 is controllable to extract one or more fields from the input packet. For each field to be extracted by the field extraction circuit, the programmed control information indicates an offset of the field in the packet and a size of the field. For a tuple field value that is not based on a packet field, the input program information indicates to the field extraction circuit to not extract any fields from the packet. Further disclosure of a field extraction circuit is found in the co-pending patent application having Ser. No. 13/229,083, entitled, “CIRCUIT AND METHOD FOR EXTRACTING FIELDS FROM PACKETS,” by Michael Attig, and assigned to Xilinx, Inc.; the entire contents of this co-pending application are incorporated by reference into this application. The extracted value(s) of the field(s) of the packet are output by the field extraction circuit and input to the computation circuit 424.

The constant staging circuit 422 stages constant values for input to the computation circuit 424. The programmed control information input to the constant staging circuit indicates which constant value, if any, is to be provided to the computation circuit. Depending on application requirements, multiple constant values may be provided to the computation circuit. The programmed control information input to the constant staging circuit may provide the constant values, or alternatively, reference constant values stored within the constant staging circuit. The time at which the constant value(s) is provided as input to the computation circuit coincides with the provision of the field value(s) as input to the computation circuit.

The computation circuit 424 computes the value of the tuple field to be inserted into the tuple based on registered packet field values, registered constant values, and/or a registered input tuple. The computation circuit may be an arithmetic logic unit that performs arithmetic and/or logic functions on designated operands. The operation(s) to be performed may be provided to the computation circuit as executable instructions. The instructions also indicate which registered values are the operands. A no-operation-type instruction may be used to indicate to the computation circuit that a registered value is to be output without changing its value. The computation circuit may provide values for multiple tuple fields depending on application requirements.

The tuple construction circuit 426 inserts the tuple field value(s) from the computation circuit 424 into the proper location(s) in the in-process tuple (the tuple being constructed). The offset(s) provided in the programmed control information indicates the proper location(s) of the tuple field value(s). The size(s) provided in the programmed control information indicates the number of bits occupied by the tuple field value(s). Once the tuple field value(s) is inserted in the tuple, the tuple and packet are forwarded to the next stage in the pipeline. Since packets are streamed word-wise, a tuple does not necessarily have to wait until the entire packet has been received to proceed to the next stage. Rather, a tuple may be forwarded to the next stage once the word of the packet having the last needed packet field has been extracted and processed to create the tuple field value. If no field is extracted from an input packet to create any tuple field value, the tuple may be forwarded to the next stage at the same time the first packet word is forwarded to the next stage.
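The behavior of one stage (extract, stage constants, compute, insert) and the chaining of stages can be sketched in software. The following Python model is only an illustration under stated assumptions: the packet and tuple are modeled as integers with offsets counted from the least significant bit, each stage inserts a single tuple field, and the names StageConfig, extract_field, run_stage, and run_pipeline are hypothetical and do not come from the disclosure. Timing, registers, and word-wise streaming are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StageConfig:
    """Hypothetical stand-in for the programmed control information of one stage."""
    extract: List[Tuple[int, int]]   # (bit offset, bit size) per packet field; empty = extract nothing
    constants: List[int]             # constants staged for the computation
    compute: Callable[[List[int], List[int], int], int]  # (fields, constants, input tuple) -> field value
    tuple_offset: int                # where the tuple field value goes in the tuple
    tuple_size: int                  # number of bits the tuple field value occupies


def extract_field(packet: int, offset: int, size: int) -> int:
    """Return `size` bits of `packet` starting at bit `offset` (LSB = bit 0, an assumption)."""
    return (packet >> offset) & ((1 << size) - 1)


def run_stage(cfg: StageConfig, packet: int, in_tuple: int) -> int:
    """Model one stage: extract fields, compute the tuple field value, insert it."""
    fields = [extract_field(packet, off, size) for off, size in cfg.extract]
    value = cfg.compute(fields, cfg.constants, in_tuple) & ((1 << cfg.tuple_size) - 1)
    field_bits = ((1 << cfg.tuple_size) - 1) << cfg.tuple_offset
    # Clear the destination bits, then OR in the aligned value.
    return (in_tuple & ~field_bits) | (value << cfg.tuple_offset)


def run_pipeline(stages: List[StageConfig], packet: int, initial_tuple: int = 0) -> int:
    """Feed the output tuple of each stage to the next stage."""
    current = initial_tuple
    for cfg in stages:
        current = run_stage(cfg, packet, current)
    return current


# Example: stage 1 copies packet bits 8..15 into tuple bits 0..7;
# stage 2 extracts nothing and inserts a constant into tuple bits 8..11.
example_stages = [
    StageConfig([(8, 8)], [], lambda f, c, t: f[0], 0, 8),
    StageConfig([], [1], lambda f, c, t: c[0], 8, 4),
]
example_tuple = run_pipeline(example_stages, 0xAABBCCDD)
```

In this model the second stage shows the case in which no packet field is extracted and the tuple field value is based on a constant only.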

FIG. 5 shows an example implementation of a tuple construction circuit 500 that inserts a field value into a tuple in a stage of the pipeline circuit of FIG. 4. The tuple construction circuit performs four main tasks. The first task is to create a mask of the size needed for the tuple field value to be inserted. For example, if the tuple field value is 16 bits, then a 16-bit mask is generated. Next, the tuple field value and the mask are shifted to align with the proper position in the tuple. In the third task, the mask is applied to the tuple to clear the appropriate bits in the tuple for the tuple field value. The fourth task is to insert the tuple field value into the tuple.
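The four tasks can be written out as a short software sketch. The Python function below is an illustrative model, not the circuit of FIG. 5: the function name insert_field and the width parameter are assumptions, the tuple is modeled as an integer, and all four tasks run in one call rather than in registered pipeline steps.

```python
def insert_field(tuple_word: int, value: int, offset: int, size: int, width: int = 112) -> int:
    """Insert `value` (`size` bits) into `tuple_word` at bit `offset` from the LSB."""
    all_ones = (1 << width) - 1

    # Task 1: create a mask word with `size` logic 0 bits, right aligned,
    # and logic 1 bits in all other positions.
    mask = all_ones & ~((1 << size) - 1)

    # Task 2: shift the value (shifting in 0s) and the mask (shifting in 1s)
    # to the proper position in the tuple.
    value_aligned = ((value & ((1 << size) - 1)) << offset) & all_ones
    mask_aligned = ((mask << offset) | ((1 << offset) - 1)) & all_ones

    # Task 3: apply the mask to clear the destination bits (AND).
    cleared = tuple_word & mask_aligned

    # Task 4: insert the tuple field value (OR).
    return cleared | value_aligned
```

For example, under these assumptions insert_field(0xFFFF, 0x06, 4, 8, width=16) clears bits 4 through 11 of the 16-bit word and writes 0x06 there.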

The data path including elements 502, 506, 536, 542, 554, 562, 564, and 566 may be viewed as a mask circuit within the tuple construction circuit, and the elements 510, 512, 532, 540, 552, 560, 568, and 572 may be viewed as a tuple insertion circuit within the tuple construction circuit.

The proper size mask is created by selecting a mask word with multiplexer 502 from mask words having mask sizes that correspond to the different possible sizes of tuple fields. In an example implementation, the mask bits are logic 0 bits and are right aligned in a mask word having logic 1 bits in all other positions. For example, for a tuple field of size 8 bits, the rightmost 8 bits of the mask word selected by and output from multiplexer 502 are logic 0 bits, and all other bits of the selected mask word are logic 1. The tuple field size signal 504 selects the proper mask word, and the selected mask word is stored in register 506.
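Selecting a mask word by field size can be modeled as a table lookup. In the Python sketch below, the 32-bit word width and the set of supported field sizes are assumptions chosen only for illustration, and MASK_WORDS and select_mask_word are hypothetical names; the dictionary stands in for the inputs of multiplexer 502 and the key stands in for the tuple field size signal 504.

```python
WIDTH = 32                      # assumed tuple width for this illustration
SUPPORTED_SIZES = (8, 16, 32)   # assumed set of possible tuple field sizes

# One precomputed mask word per supported size: the right-most `size` bits
# are logic 0 and every other bit is logic 1.
MASK_WORDS = {
    size: ((1 << WIDTH) - 1) & ~((1 << size) - 1)
    for size in SUPPORTED_SIZES
}


def select_mask_word(field_size: int) -> int:
    """Model of the multiplexer: the field size selects one precomputed mask word."""
    return MASK_WORDS[field_size]
```

With these assumed values, an 8-bit field selects 0xFFFFFF00 and a 16-bit field selects 0xFFFF0000.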

In parallel with the selection of the mask word, the tuple field value is input via multiplexer 510 and register 512. Also, the field enable signal 514 provides the selection of the tuple field value via multiplexer 510 and the field offset via multiplexer 516. The state of the field enable signal is stored in register 518, and the field offset is stored in register 520. The tuple being constructed is input to register 522 also in parallel with selection of the mask word.

The mask word and the tuple field value are shifted in two stages. In stage 526, the tuple field value and the mask word are left shifted by a number of bits indicated by the low-order bits of the field offset 528, and in stage 530 the output of the first shift stage is shifted by a number of bits indicated by the high-order bits of the field offset. In stage 526, multiplexer 532 selects from inputs in which the tuple field value has been left shifted by 0 to n−1 bits. The notation “<<x” in the diagram indicates a circuit that left shifts the input by x bits. The input tuple field value 534 occupies the low-order (right-most) bits of the input word, and the other bits are logic 0. Logic 0 values are shifted in as the tuple field value is left shifted. The mask in the mask word is also left shifted, and multiplexer 536 selects the mask word that was shifted by the same number of bits as the tuple field value. The mask occupies the low-order bits in the input mask word 538, and the other bits are logic 1. Logic 1 bits are shifted in as the mask is left shifted.

The low-order bits of the field offset are used to control the selections by multiplexers 532 and 536. For selecting from words that have been left shifted from 0 to n−1 bits, bits 0 through log₂(n)−1 of the field offset are used.

The selected tuple field value is stored in register 540, and the selected mask word is stored in register 542. The tuple, field enable signal, and field offset are forwarded to registers 544, 546, and 548, respectively, to maintain proper timing within the pipeline and allow the next tuple and tuple field value to be processed.

In stage 530, the tuple field value and the mask are left shifted by a number of bits specified by the high-order bits of the field offset. In stage 530, multiplexer 552 selects from inputs in which the tuple field value has been left shifted by 0, n, 2n, ..., n(n−1) bits, and multiplexer 554 selects from inputs in which the mask has been left shifted by 0, n, 2n, ..., n(n−1) bits. For the tuple field value, logic 0 bits are shifted in, and for the mask word, logic 1 bits are shifted in. For selecting from words that have been left shifted by 0, n, 2n, ..., n(n−1) bits, bits log₂(n) through 2·log₂(n)−1 of the field offset are used. The tuple, field enable signal, selected tuple field value, and selected mask word are stored in registers 556, 558, 560, and 562, respectively.
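The split of the field offset into a fine shift and a coarse shift can be illustrated in software. The Python sketch below assumes that n is a power of two, so that the low-order log₂(n) bits of the offset select a shift of 0 to n−1 and the next log₂(n) bits select a shift of 0, n, ..., n(n−1); the function names, the default n = 16, and the default width are hypothetical. The largest offset this model can reach is n·n − 1.

```python
def shift_left_fill(word: int, amount: int, fill_bit: int, width: int) -> int:
    """Left shift within `width` bits, shifting in `fill_bit` (0 for values, 1 for masks)."""
    shifted = (word << amount) & ((1 << width) - 1)
    if fill_bit:
        shifted |= (1 << amount) - 1
    return shifted


def two_stage_shift(word: int, offset: int, fill_bit: int, n: int = 16, width: int = 128) -> int:
    """Shift by `offset` in two steps controlled by the low- and high-order offset bits."""
    if n < 2 or n & (n - 1):
        raise ValueError("this sketch assumes n is a power of two")
    k = n.bit_length() - 1            # log2(n)
    low = offset & (n - 1)            # offset bits 0 .. log2(n)-1
    high = (offset >> k) & (n - 1)    # offset bits log2(n) .. 2*log2(n)-1
    first = shift_left_fill(word, low, fill_bit, width)        # shift by 0 .. n-1
    return shift_left_fill(first, high * n, fill_bit, width)   # shift by 0, n, ..., n(n-1)
```

With n = 16, an offset of 19 is applied as a shift of 3 in the first step and a shift of 16 in the second step. Splitting the shift this way needs two n-input selections instead of one selection with n·n inputs.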

The tuple from register 556 and the mask word from register 562 are input to AND circuit 564, which clears the bits in the tuple for the tuple field value to be inserted. The output is stored in register 566, and in parallel, the tuple field value from register 560 is stored in register 568, and the field enable signal is stored in register 570. The tuple with the cleared bits from register 566 and the tuple field value from register 568 are input to OR circuit 572, which outputs the tuple with the tuple field value inserted at the proper offset in the tuple. The tuple is stored in register 574, and in parallel, the field enable signal is forwarded for storage in register 576. The tuple is then ready for the next stage (if any) of the pipeline circuit 400 of FIG. 4. The field enable signal indicates availability of the tuple having the tuple field value inserted.

Multiple tuple fields may be inserted into a tuple in parallel in an example implementation. For each tuple field value to be inserted, the circuitry for shifting the tuple field value and constructing and shifting a mask would be replicated. The dashed line 578 input to AND circuit 564 represents the mask word having the shifted mask for the additional tuple field value. The dashed line 580 input to OR circuit 572 represents the additional shifted tuple field value.
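Parallel insertion amounts to ANDing the tuple with every aligned mask word and then ORing in every aligned tuple field value. The Python sketch below is an illustration only: the function name insert_fields_parallel and the (value, offset, size) triple format are assumptions, and the sketch assumes the fields do not overlap, since overlapping fields would be ORed together.

```python
from functools import reduce
from typing import Iterable, Tuple


def insert_fields_parallel(tuple_word: int,
                           fields: Iterable[Tuple[int, int, int]],
                           width: int) -> int:
    """Insert several (value, offset, size) fields with one AND step and one OR step."""
    fields = list(fields)
    all_ones = (1 << width) - 1

    # One aligned mask word and one aligned value per field, as if the
    # mask and shift circuitry were replicated for each field.
    masks = [all_ones & ~(((1 << size) - 1) << offset) for _, offset, size in fields]
    values = [((value & ((1 << size) - 1)) << offset) & all_ones
              for value, offset, size in fields]

    # AND step: all masks clear their bits in the tuple together.
    cleared = reduce(lambda acc, m: acc & m, masks, tuple_word)
    # OR step: all aligned values are merged into the cleared tuple.
    return reduce(lambda acc, v: acc | v, values, cleared)
```

For instance, inserting 0x1 at offset 0 and 0x2 at offset 8, each 4 bits wide, into the 16-bit word 0xFFFF gives 0xF2F1 in this model.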

FIGS. 6, 7, and 8 show an example in which two tuple field values are inserted into a tuple. FIG. 6 shows the tuple under construction in steps 0-5; FIG. 7 shows the construction of tuple field 1 and the mask for tuple field 1; and FIG. 8 shows the construction of tuple field 2 and the mask for tuple field 2. In Step 0, the initial input tuple is specified as having srcPort set to 0xFFFF and dstPort set to 0x8888, while all other tuple fields are initialized to 0. There are two tuple field values to insert into the tuple. The first tuple field value (field 1) to be inserted is the value 0x06, which is 8 bits, and is to be placed at offset 96 (from the least significant bit position) in the tuple. The second tuple field value (field 2) to be inserted is the value 0x0032, which is 16 bits, and is to be placed at offset 16. Thus, the first tuple field value is to be inserted as the proto tuple field, and the second tuple field value is to be inserted as the srcPort tuple field.

In Step 1, the masks for fields 1 and 2 are constructed. This involves creating a mask of 0xFFFF FFFF FFFF FFFF FFFF FFFF FF00 for field 1 and a mask of 0xFFFF FFFF FFFF FFFF FFFF FFFF 0000 for field 2. Note that the first mask clears 8 bits while the second mask clears 16 bits.

In Step 2, the fields and masks are aligned to the appropriate position in the tuple being constructed by using the appropriate offset for the input field. The aligned field and mask values for field 1 are 0x0006 0000 0000 0000 0000 0000 0000 and 0xFF00 FFFF FFFF FFFF FFFF FFFF FFFF, respectively. The aligned field and mask values for field 2 are 0x0000 0000 0000 0000 0000 0032 0000 and 0xFFFF FFFF FFFF FFFF FFFF 0000 FFFF, respectively.

In Step 3 the masks are applied to the input tuple. This results in a change to the value held in the srcPort from 0xFFFF to 0x0000. There is no change to the proto field, because it was already at 0x00.

In Step 4 the new fields are inserted. This results in a value of 0x06 for the proto field and 0x0032 for the srcPort field.

In Step 5 the result is output. The final tuple value is 0x0006 0000 0000 0000 0000 0032 8888.
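Steps 0 through 5 can be checked with a few lines of integer arithmetic. The Python sketch below assumes a 112-bit tuple, which matches the 28 hexadecimal digits of the mask words in the example; the variable names and the groups helper are illustrative only.

```python
WIDTH = 112                      # 28 hex digits, as in the mask words of the example
ALL_ONES = (1 << WIDTH) - 1

# Step 0: input tuple with srcPort = 0xFFFF (offset 16) and dstPort = 0x8888 (offset 0).
tuple_in = (0xFFFF << 16) | 0x8888
fields = [(0x06, 96, 8),         # field 1: proto, 8 bits at offset 96
          (0x0032, 16, 16)]      # field 2: srcPort, 16 bits at offset 16

# Step 1: right-aligned masks (0 bits for the field, 1 bits elsewhere).
masks = [ALL_ONES & ~((1 << size) - 1) for _, _, size in fields]

# Step 2: align values (shift in 0s) and masks (shift in 1s).
aligned_values = [(value << offset) & ALL_ONES for value, offset, _ in fields]
aligned_masks = [((mask << offset) | ((1 << offset) - 1)) & ALL_ONES
                 for mask, (_, offset, _) in zip(masks, fields)]

# Step 3: apply the masks to clear the destination bits.
cleared = tuple_in
for mask in aligned_masks:
    cleared &= mask

# Step 4: insert the new field values.
result = cleared
for value in aligned_values:
    result |= value


def groups(word: int) -> str:
    """Format a 112-bit word as seven groups of four hex digits."""
    digits = f"{word:028X}"
    return " ".join(digits[i:i + 4] for i in range(0, 28, 4))


# Step 5: output the result.
print(groups(result))            # prints "0006 0000 0000 0000 0000 0032 8888"
```

The intermediate values masks, aligned_values, and aligned_masks correspond to the values listed for Steps 1 and 2, and cleared corresponds to the tuple after Step 3.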

FIG. 9 is a flowchart of an example process for constructing a tuple. The processing of blocks 604-614 is performed in each stage of the tuple pipeline as indicated by block 602. At block 604 a packet is input. Depending on the application and particular tuple to be constructed, the packet may contain data to use in generating a tuple field value. A tuple is input at block 606. The tuple that is input depends on the stage of the pipeline and on the application. For example, for the first stage of the pipeline, the input tuple may be a tuple with initialized values or a tuple input from elsewhere in a packet processing system. For other stages, the input tuple is the tuple from the previous stage of the tuple construction pipeline.

At block 608, the data for generating a tuple field value is obtained. As described above, the data may be one or more fields extracted from the input packet, one or more constant values, or one or more input tuple field values. The tuple field value is generated at block 610. The tuple field value may be an arithmetic or logic function of one or more packet field values, one or more constants, and/or one or more input tuple field values.

The tuple field value is inserted into the tuple at block 612, and the tuple is output at block 614. For stages other than the final stage, the tuple is output for processing by the next stage in the tuple construction pipeline, and for the final stage, the tuple is output from the pipeline.

FIG. 10 shows an example programmable integrated circuit (IC) on which the circuitry described herein may be implemented. The programmable IC of FIG. 10 is an FPGA. FPGAs can include several different types of programmable logic blocks in the array. For example, FIG. 10 illustrates an FPGA architecture (700) that includes a large number of different programmable tiles including multi-gigabit transceivers (MGTs 701), configurable logic blocks (CLBs 702), random access memory blocks (BRAMs 703), input/output blocks (IOBs 704), configuration and clocking logic (CONFIG/CLOCKS 705), digital signal processing blocks (DSPs 706), specialized input/output blocks (I/O 707), e.g., clock ports, and other programmable logic 708 such as digital clock managers, analog-to-digital converters, system monitoring logic, and so forth. Some FPGAs also include dedicated processor blocks (PROC 710) and internal and external reconfiguration ports (not shown).

In some FPGAs, each programmable tile includes a programmable interconnect element (INT 711) having standardized connections to and from a corresponding interconnect element in each adjacent tile. Therefore, the programmable interconnect elements taken together implement the programmable interconnect structure for the illustrated FPGA. The programmable interconnect element INT 711 also includes the connections to and from the programmable logic element within the same tile, as shown by the examples included at the top of FIG. 10.

For example, a CLB 702 can include a configurable logic element CLE 712 that can be programmed to implement user logic plus a single programmable interconnect element INT 711. A BRAM 703 can include a BRAM logic element (BRL 713) in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the width of the tile. In the pictured FPGA, a BRAM tile has the same width as five CLBs, but other numbers (e.g., four) can also be used. A DSP tile 706 can include a DSP logic element (DSPL 714) in addition to an appropriate number of programmable interconnect elements. An IOB 704 can include, for example, two instances of an input/output logic element (IOL 715) in addition to one instance of the programmable interconnect element INT 711. As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 715 are manufactured using metal layered above the various illustrated logic blocks, and typically are not confined to the area of the input/output logic element 715.

In the pictured FPGA, a horizontal area near the center of the die (shown shaded in FIG. 10) is used for configuration, clock, and other control logic. Vertical areas 709 extending from this horizontal area are used to distribute the clocks and configuration signals across the breadth of the FPGA.

Some FPGAs utilizing the architecture illustrated in FIG. 10 include additional logic blocks that disrupt the regular row structure making up a large part of the FPGA. The additional logic blocks can be programmable blocks and/or dedicated logic. For example, the processor block PROC 710 shown in FIG. 10 spans several rows of CLBs and BRAMs.

Note that FIG. 10 is intended to illustrate only an exemplary FPGA architecture. The numbers of logic blocks in a row, the relative heights of the rows, the number and order of rows, the types of logic blocks included in the rows, the relative sizes of the logic blocks, and the interconnect/logic implementations included at the top of FIG. 10 are purely exemplary. For example, in an actual FPGA more than one adjacent row of CLBs is typically included wherever the CLBs appear, to facilitate the efficient implementation of user logic.

The methods and circuits are thought to be applicable to a variety of systems for constructing tuples. Other aspects and features will be apparent to those skilled in the art from consideration of the specification. The processes and circuits may be implemented as one or more processors configured to execute software, as an application specific integrated circuit (ASIC), or as logic on a programmable logic device. It is intended that the described features and aspects be considered as examples only, with a true scope of the invention being indicated by the following claims.

What is claimed is:
 1. A method of processing a data packet, comprising: in at least one stage of a plurality of stages of a pipeline circuit, extracting a respective packet field value from the data packet; in each stage of the plurality of stages: inputting an in-process tuple into a respective tuple register; inputting a respective programmable offset value; creating in a mask register, a mask word having a subset of bits equal in number to a number of bits of a respective tuple field value and positioned in the mask word in response to the respective programmable offset value; clearing by a first circuit, bits of the in-process tuple in the respective tuple register using the subset of bits in the mask word in the mask register; inserting by a second circuit, the respective tuple field value based on the respective packet field value into the respective tuple register of the stage by replacing the cleared bits of the respective in-process tuple with the tuple field value; and in each stage of the plurality of stages except a last one of the stages, providing contents of the respective tuple register of the stage as input to a next one of the stages.
 2. The method of claim 1, further comprising, computing the respective tuple field value in the at least one stage as a function of the respective packet field value.
 3. The method of claim 2, further comprising: inputting a respective set of one or more constants to the at least one stage; and computing the respective tuple field value in the at least one stage as a function of the respective packet field value and the respective set of one or more constants.
 4. The method of claim 1, wherein: the extracting of the respective packet field value from the data packet in the at least one stage includes, extracting a respective set that includes two or more packet field values from the data packet; and the inserting of the respective tuple field value into a respective tuple register in the at least one stage includes inserting the respective tuple field value based on the respective set of two or more packet field values.
 5. The method of claim 1, further comprising, in at least one stage of the plurality of stages, inserting two tuple field values into the respective tuple register in parallel.
 6. The method of claim 1, further comprising, computing the respective tuple field value in the at least one stage as a function of the respective packet field value and at least one tuple field value of the input from a previous one of the plurality of stages.
 7. The method of claim 1, further comprising: inputting a respective set of one or more constants to the at least one stage; and computing the respective tuple field value in the at least one stage as a function of the respective packet field value, at least one tuple field value of the input from a previous one of the plurality of stages, and the respective set of one or more constants.
 8. The method of claim 1, further comprising inputting a respective programmable field size indicative of a number of bits of the respective tuple field value.
 9. The method of claim 1, wherein the creating the mask word includes: selecting a mask word having the subset of bits in right-most bits of the mask word and storing the selected mask word in the mask register; and shifting bits of the mask word a number of positions indicated by the programmable offset value.
 10. A packet processing circuit, comprising: a plurality of pipeline stages, each stage including: a field extraction circuit configured to receive a data packet and configurable to extract none or a plurality of packet field values from the data packet; and a tuple construction circuit coupled to receive an input tuple, a respective programmable offset value, and each packet field value from the field extraction circuit, the tuple construction circuit configured to insert a respective tuple field value based on the received packet field values into the input tuple at a respective offset and output a tuple having the inserted respective tuple field value; wherein each tuple construction circuit includes: a first circuit configured to: create a mask word in a mask register having a subset of bits equal in number to a number of bits of the respective tuple field value and positioned in the mask word in response to the respective programmable offset value, and clear bits of the input tuple using the subset of bits in the mask word; and a second circuit configured to replace the cleared bits of the input tuple with the respective tuple field value.
 11. The circuit of claim 10, wherein each stage further comprises a computation circuit coupled to the field extraction circuit, the computation circuit configured to compute the respective tuple field value as a function of the packet field values.
 12. The circuit of claim 11, wherein each stage further comprises: a constant staging circuit coupled to the computation circuit, the constant staging circuit configured to input a respective set of one or more constants; wherein the computation circuit is configured to compute the respective tuple field value as a function of the packet field values and the respective set of one or more constants.
 13. The circuit of claim 10, wherein: the field extraction circuit is further configured to extract a respective set that includes two or more packet field values from the data packet; and the tuple construction circuit is further configured to insert the respective tuple field value based on the respective set of two or more packet field values.
 14. The circuit of claim 10, wherein the tuple construction circuit in at least one stage of the plurality of stages is further configured to insert two tuple field values into the input tuple in parallel.
 15. The circuit of claim 10, wherein each stage further comprises a computation circuit coupled to the field extraction circuit, the computation circuit configured to compute the respective tuple field value as a function of the packet field values and at least one tuple field value of the input tuple.
 16. The circuit of claim 10, further comprising: a computation circuit coupled to the field extraction circuit; and a constant staging circuit coupled to the computation circuit, the constant staging circuit configured to input a respective set of one or more constants; wherein the computation circuit is configured to compute the respective tuple field value as a function of the packet field values, at least one tuple field value of the input tuple, and the respective set of one or more constants.
 17. The circuit of claim 10, wherein each tuple construction circuit is responsive to a respective programmable tuple field size indicative of a number of bits of the respective tuple field value.
 18. The packet processing circuit of claim 10, wherein the mask circuit is further configured to: select a mask word having the subset of bits in right-most bits of the mask word; store the selected mask word in the mask register; and shift bits of the mask word a number of positions indicated by the programmable offset value.